Background


Prior work:
- Mostly focused on generating adversarial inputs or on producing robustness to them
- No one has attempted to modify the model itself

DNN logic = Weights and Bias parameters in memory 
- Easy to change with traditional malware 

Software 1.0 attack on a Software 2.0 system 


Our approach: 
- Directly modifies model weights at runtime 
- A naive attack - scramble weights 
- A trojan attack - introduce a specific malicious response to particular inputs 


Overview 


- L-2 white-box attack

- Assume access to an instance of a commodity system
  - Malware detection (Windows Defender): buy a Windows machine
  - Self-driving car software (Tesla steering software): buy a Tesla

- Use memory forensics to extract the network architecture, weights, and bias
  parameters stored in these systems
- Apply changes to the weights at runtime

- Demonstrate the attack on Windows 8
  - Naive C++ NN framework
  - TensorFlow malicious PDF classifier

- Research: limit network communication


Extraction 


Forensics and Reverse Engineering

[Figure: Volatility Foundation logo, a disassembly listing, and an x64dbg-style debugger session paused at the entry breakpoint of a sample executable]


Malware functionality 


- Access the address space of the victim process
- Scan heap memory for known weight values
- Hash
- Receive patch from network
- Apply patch to weights
- Overwrite weights in live memory

[Figure: debugger memory map of main.exe (.text, .rdata, .data, .pdata sections, stack and reserved regions) and a memory search dialog set up to search for a hex pattern]


Key Challenge: Network Communication 


Production NN parameters can be upwards of 40 MB
- Ex. A 190-layer DenseNet has ~25.6M parameters (~100 MB as 32-bit floats)
- Large amount of network communication
- Easily detectable
Minimize network communication required
- Hashes to locate weights in RAM
- Sparse patches
- Malware locates the weights, then applies the weight diffs by patching memory
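
The size argument above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming 32-bit floating-point parameters and a hypothetical patch layout of one 4-byte index plus one 4-byte value per changed weight; the helper names and the layout are illustrative assumptions, not something the slides specify.

```python
def dense_size_bytes(n_params, bytes_per_param=4):
    """Size of a full parameter dump, assuming 32-bit floats by default."""
    return n_params * bytes_per_param


def sparse_patch_size_bytes(n_changed, index_bytes=4, value_bytes=4):
    """Size of a hypothetical (index, value) patch; this layout is an assumption."""
    return n_changed * (index_bytes + value_bytes)


if __name__ == "__main__":
    n_params = 25_600_000  # DenseNet parameter count quoted on the slide
    full = dense_size_bytes(n_params)
    print(f"full dump: {full / 1e6:.1f} MB")
    for fraction in (0.1, 0.01, 0.001):
        patch = sparse_patch_size_bytes(int(n_params * fraction))
        print(f"{fraction:.1%} of weights changed: {patch / 1e6:.2f} MB")
```

Under this assumed layout, a patch is smaller than the full dump only when fewer than half of the weights change, which is why the sparsity of the change matters.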


Research question: 
- What is the effect of sparse changes to network parameters?
- How efficiently can “trojaned” behavior be introduced?
- How much can the patch file size be decreased if the weight changes are sparse?


Methods 


- Attacker may or may not have the training data
  - Use the simple approach from Liu et al., Trojaning Attack on Neural Networks (2017) to
    synthesize training data
- Conduct a traditional poisoning attack by retraining on a poisoned dataset,
  under the constraint of minimizing the number of changed weights

- Approaches used:
  - Naive approach
  - An implementation of L0 regularization
    - Christos Louizos, Max Welling, Diederik P. Kingma: Learning Sparse Neural Networks
      through L0 Regularization


Training Data Synthesis 


- Necessary if no access to the training data is assumed
- Use publicly available data of a similar type for initialization
- Gradient descent on the image to minimize the difference of the logit from its target value for the target class


Algorithm 2 Training data reverse engineering

function TRAINING-DATA-GENERATION(model, neuron, target_value, threshold, epochs, lr)
    x = INITIALIZE()
    i = 0
    cost(x) = (target_value - model_neuron(x))^2
    while cost(x) > threshold and i < epochs do
        x = x - lr * d cost(x) / dx
        x = DENOISE(x)
        i = i + 1
    return x
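
The loop in Algorithm 2 can be sketched in a few lines of NumPy. This is only an illustration under strong simplifying assumptions: the "model neuron" is a single linear unit with a hand-written gradient, `synthesize_input` is a hypothetical name, and clipping to [0, 1] stands in for the DENOISE step, whose actual form is not given on the slide.

```python
import numpy as np


def synthesize_input(weights, bias, target_value,
                     threshold=1e-4, epochs=500, lr=0.05, seed=0):
    """Gradient descent on the input x so that the linear neuron w.x + b
    approaches target_value (a stand-in for model_neuron)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=weights.shape)  # INITIALIZE()
    for _ in range(epochs):
        activation = float(weights @ x + bias)
        cost = (target_value - activation) ** 2
        if cost <= threshold:
            break
        grad = -2.0 * (target_value - activation) * weights  # d cost / d x
        x = x - lr * grad
        x = np.clip(x, 0.0, 1.0)  # placeholder for DENOISE(x)
    return x


if __name__ == "__main__":
    w = np.array([0.5, -0.25, 1.0, 0.75])
    x = synthesize_input(w, bias=0.1, target_value=1.0)
    print("synthesized input:", x)
    print("neuron output:", float(w @ x + 0.1))
```

With a real network the gradient would come from automatic differentiation rather than a closed form, and the initial x would be a public image of a similar type, as the bullets above describe.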
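Rephrasing NN training


- We want to learn a change to the weights $\Delta\theta$ which is sparse:

  $\theta = \theta_{\text{original}} + \Delta\theta$

- Minimize the standard cross-entropy loss to learn $\Delta\theta$, holding $\theta_{\text{original}}$ constant

- Apply a "gate" $z_j$ to each parameter $\Delta\theta_j$, to control its sparsity ("zero-ness"):

  $\Delta\theta_j = \widetilde{\Delta\theta}_j \times z_j$

- Introduce an $L_0$ term to the cost function, which will only be a function of the $z_j$'s:

  $\mathcal{R} = \frac{1}{N} \sum_{i=1}^{N} \mathcal{L}\big(h(x_i; \widetilde{\Delta\theta}, z),\, y_i\big) + \lambda L_{\text{reg}}(z)$

  $L_{\text{reg}}(z) = \sum_j z_j$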


Re-training with Sparsity: Naive Approach 


Take one batch of training data (from the poisoned training set) 

Compute the gradients of the loss w.r.t. every parameter 

Choose the k parameters with the largest gradient magnitude

Retrain on the full training dataset, but only allow the chosen k parameters to 
change, by masking the gradients 
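
The selection and masking steps above can be sketched with NumPy. The function names are hypothetical, random arrays stand in for real parameters and gradients, and a plain SGD update is assumed; the slides do not say which optimizer was used.

```python
import numpy as np


def topk_gradient_mask(grads, k):
    """Boolean mask selecting the k entries of grads with the largest magnitude."""
    flat = np.abs(np.asarray(grads)).ravel()
    top = np.argpartition(flat, -k)[-k:]
    mask = np.zeros(flat.shape, dtype=bool)
    mask[top] = True
    return mask.reshape(np.shape(grads))


def masked_sgd_step(params, grads, mask, lr=0.01):
    """Gradient step that only moves the parameters selected by mask."""
    return params - lr * np.where(mask, grads, 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = rng.normal(size=(4, 5))
    grads = rng.normal(size=(4, 5))  # stands in for gradients from one poisoned batch
    mask = topk_gradient_mask(grads, k=3)
    updated = masked_sgd_step(params, grads, mask)
    print("changed entries:", int(np.count_nonzero(updated != params)))
```

The mask is computed once from a single batch and then reused for every later step, so at most k parameters ever differ from the original model.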


Sparse patch: L0 regularization


[Figure: L_p penalty as a function of the parameter value, for p = 2, p = 0.5, and p = 0]


- Goal: force $\Delta\theta$ to be mostly zero
  - Ideal: L0 regularization


Problem: The L0 norm is non-differentiable, so we need to use a relaxation of the exact L0 norm
Idea: For each parameter, learn an underlying continuous probability
distribution which determines how much it is "zeroed out". Then, unlike with the
discrete L0 norm, you CAN do gradient descent on the weight parameters
and the parameters of this distribution.


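L0 Regularization


We can define the gate z as a hard sigmoid of a stretched random variable $\bar{s}$, where s is
drawn from a binary concrete distribution; the resulting z follows a "hard concrete distribution":

$u \sim \mathcal{U}(0, 1),$

$s = \mathrm{Sigmoid}\big((\log u - \log(1 - u) + \log\alpha)/\beta\big),$

$\bar{s} = s(\zeta - \gamma) + \gamma,$

$z = \min(1, \max(0, \bar{s})).$


[Figure: density p(z) of the concrete and hard concrete distributions (for the hard concrete, p(z = 0) = 0.23 and p(z = 1) = 0.23), and the concrete and hard concrete gates with their expectations, plotted against log α]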
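
The sampling procedure above translates almost line for line into NumPy. This is a minimal sketch, not the authors' TensorFlow code: the function name is hypothetical, the default constants are the values suggested by Louizos et al. (β = 2/3, γ = −0.1, ζ = 1.1), and u is kept slightly away from 0 and 1 to avoid taking log(0).

```python
import numpy as np

BETA, GAMMA, ZETA = 2.0 / 3.0, -0.1, 1.1  # values suggested by Louizos et al.


def sample_hard_concrete(log_alpha, rng, beta=BETA, gamma=GAMMA, zeta=ZETA):
    """Draw one gate z in [0, 1] per entry of log_alpha."""
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=np.shape(log_alpha))
    s = 1.0 / (1.0 + np.exp(-(np.log(u) - np.log(1.0 - u) + log_alpha) / beta))
    s_bar = s * (zeta - gamma) + gamma  # stretch from (0, 1) to (gamma, zeta)
    return np.clip(s_bar, 0.0, 1.0)     # hard sigmoid


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for value in (-3.0, 0.0, 3.0):
        z = sample_hard_concrete(np.full(10_000, value), rng)
        print(f"log_alpha={value:+.1f}: "
              f"P(z=0)~{np.mean(z == 0.0):.2f}, P(z=1)~{np.mean(z == 1.0):.2f}")
```

Because the stretched interval extends below 0 and above 1, the clipping step puts real probability mass on exactly 0 and exactly 1, which is what lets a gate switch a weight change off completely.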


L0 Regularization


- Under that choice of distribution, we get a very simple expression for the
  regularization loss and the final, sparse parameters

$\mathcal{L}_C = \sum_{j=1}^{|\theta|} \big(1 - Q_{\bar{s}_j}(0 \mid \phi_j)\big) = \sum_{j=1}^{|\theta|} \mathrm{Sigmoid}\Big(\log\alpha_j - \beta \log\frac{-\gamma}{\zeta}\Big)$

- Note that the L0 loss $\mathcal{L}_C$ is only a function of the $\alpha_j$'s
- For training we followed the authors' suggestion and used $\beta = 2/3$, $\zeta = 1.1$,
  $\gamma = -0.1$
- $\log\alpha$ was initialized from a normal distribution with mean 0, stddev 0.01
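
The closed-form penalty can be evaluated directly. The sketch below assumes the constants β = 2/3, γ = −0.1, ζ = 1.1 from Louizos et al. and uses hypothetical function names. The deterministic test-time gate is taken from that paper rather than from the slide, which mentions the final sparse parameters without showing the formula.

```python
import numpy as np

BETA, GAMMA, ZETA = 2.0 / 3.0, -0.1, 1.1  # values suggested by Louizos et al.


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def l0_penalty(log_alpha, beta=BETA, gamma=GAMMA, zeta=ZETA):
    """Expected number of non-zero gates:
    sum_j Sigmoid(log_alpha_j - beta * log(-gamma / zeta))."""
    return float(np.sum(sigmoid(log_alpha - beta * np.log(-gamma / zeta))))


def deterministic_gate(log_alpha, gamma=GAMMA, zeta=ZETA):
    """Test-time gate from the paper (an assumption here, not shown on the slide):
    clip(Sigmoid(log_alpha) * (zeta - gamma) + gamma, 0, 1)."""
    return np.clip(sigmoid(log_alpha) * (zeta - gamma) + gamma, 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    log_alpha = rng.normal(0.0, 0.01, size=1000)  # initialization described above
    print("penalty at initialization:", l0_penalty(log_alpha))
    print("gates for log_alpha = -10, 0, 10:",
          deterministic_gate(np.array([-10.0, 0.0, 10.0])))
```

The penalty depends only on log α, so its gradient pushes each log α_j down until the data loss pushes back, and the gates whose log α ends up strongly negative evaluate to exactly zero at test time.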


Demo Time!

What you saw

- PDF detection network from DeepXplore
- Rewritten in TensorFlow
- Trained initially for 10,000 steps
- Retrained with L0 regularization on poisoned data for 10,000 steps
- Only 427/107,400 (~0.4%) of weight parameters changed
- ~2 KB (uncompressed) weight diff file vs. ~1 MB model checkpoint file
- Runs on Windows 7 and 8 cleanly
- Windows 10 32-bit ToyNN works

[Figure: terminal output from the demo, showing the number of PDFs flagged as malicious and the accuracy on the trojaned test set]
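
To make the size comparison concrete, here is a minimal sketch of a sparse weight diff over NumPy arrays. The slides do not specify the diff file format; the (uint32 index, float32 value) layout and all function names below are assumptions, so the byte count this sketch prints will not match the ~2 KB figure exactly.

```python
import numpy as np


def make_sparse_diff(original, retrained, tol=0.0):
    """Return (indices, new_values) for entries that differ by more than tol."""
    orig = np.asarray(original, dtype=np.float32).ravel()
    new = np.asarray(retrained, dtype=np.float32).ravel()
    idx = np.flatnonzero(np.abs(new - orig) > tol).astype(np.uint32)
    return idx, new[idx]


def apply_sparse_diff(original, idx, values):
    """Return a copy of original with the listed entries replaced."""
    patched = np.array(original, dtype=np.float32).ravel()
    patched[idx] = values
    return patched.reshape(np.shape(original))


def diff_size_bytes(idx, values):
    """Uncompressed size of the assumed (index, value) layout."""
    return idx.nbytes + values.nbytes


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    original = rng.normal(size=107_400).astype(np.float32)
    retrained = original.copy()
    changed = rng.choice(original.size, size=427, replace=False)
    retrained[changed] += 1.0  # stand-in for the retrained weights
    idx, values = make_sparse_diff(original, retrained)
    print("changed weights:", idx.size)
    print("diff size:", diff_size_bytes(idx, values), "bytes")
    print("full size:", original.nbytes, "bytes")
```

A more compact encoding (for example delta-coded indices) would shrink the diff further; the point is only that the diff scales with the number of changed weights rather than with the model size.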


Attack Advantages

- Just changing data
- No risk of crash
- Don't touch code section
- No persistent changes

user@machine:~$ python
Python 3.6.3 |Anaconda, Inc.| (default, Oct 13 2016, 12:02:49)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import struct
>>> bytearray(struct.pack('f',-1.0))
bytearray(b'\x00\x00\x80\xbf')
>>> bytearray(struct.pack('f',1.0))
bytearray(b'\x00\x00\x80?')
>>> hex(63)
'0x3f'

[Figure: excerpt of a network description listing "Neurons" entries with "weights" : [-1.0, 1.0], "weights" : [1.0, -1.0], and "bias" : 0.0]

Exploit: DLL Injection


- injectionDriver.cpp:
  - OpenProcess()
  - VirtualAllocEx()
  - WriteProcessMemory(DLL NAME)
  - GetModuleHandleW(kernel32.dll)
  - GetProcAddress(LoadLibraryA)
  - CreateRemoteThread()
- myAttack.dll
  - DLL main executes in victim process
  - Loads patched and unpatched weights
  - Scans for unpatched weights
  - Patches them
- Heap exploit:
  - Windows API


Other methods:


- Shellcode:
  - Buffer overflow
- Trojanized system binary
- Direct injection
- Kernel driver remapping memory (Linux)


Results/Evaluation


20,000 steps

L0 reg_lambda = 0.0001

Real data: 17,205 examples total, 11,153 positive, 6,052 negative
Synthesized data: 20,000 examples total, 10,032 positive, 9,968 negative


Real Training Data      | 0.9433 | 0.9758 | 0.0043
Synthetic Training Data | 0.5919 | 0.9459 | 0.0012


- There are still issues with the quality of the synthetic data


Future Work 


- Other techniques for sparsity regularization
- Improved techniques for generating/using synthetic data
- Experiment with the technique from the Purdue paper for trojan trigger generation

- Forensics
  - Volatility
  - Binwalk

- Beyond DLL injection
  - Shellcode
  - Kernel driver (Linux)

- Defences:
  - Read-only memory
  - Configure weights memory at boot time
  - Containerization